PersonaEval: Are LLM Evaluators Human Enough to Judge Role-Play?
Zhou, Lingfeng, Zhang, Jialing, Gao, Jin, Jiang, Mohan, Wang, Dequan
Current role-play studies often rely on unvalidated LLM-as-a-judge paradigms, which may fail to reflect how humans perceive role fidelity. A key prerequisite for human-aligned evaluation is role identification, the ability to recognize who is speaking based on dialogue context. We argue that any meaningful judgment of role-playing quality (how well a character is played) fundamentally depends on first attributing words and actions to the correct persona (who is speaking). We present PersonaEval, the first benchmark designed to test whether LLM evaluators can reliably identify human roles. PersonaEval uses human-authored dialogues from novels, scripts, and video transcripts, challenging models to determine the correct persona from the conversation context. Our experiments, including a human study, show that even the best-performing LLMs reach only around 69% accuracy, well below the level needed for reliable evaluation. In contrast, human participants perform near ceiling with 90.8% accuracy, highlighting that current LLM evaluators are still not human enough to judge role-play scenarios effectively. To better understand this gap, we examine training-time adaptation and test-time compute, suggesting that reliable evaluation requires more than task-specific tuning; it depends on strong, human-like reasoning abilities in LLM evaluators. We release our benchmark at https://github.com/maple-zhou/PersonaEval.
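The role-identification task the benchmark poses can be framed as a simple evaluation loop: given a dialogue excerpt and a set of candidate personas, a judge names the speaker, and accuracy is measured against gold labels. A minimal sketch, with a hypothetical keyword-matching `judge` standing in for an LLM evaluator (note that it misattributes the first, narration-style excerpt):

```python
def judge(context: str, candidates: list[str]) -> str:
    # Toy stand-in for an LLM evaluator: pick the first candidate whose
    # name appears in the excerpt; fall back to the first candidate.
    for name in candidates:
        if name.lower() in context.lower():
            return name
    return candidates[0]

def role_id_accuracy(examples) -> float:
    # Each example: (dialogue excerpt, candidate personas, gold speaker).
    correct = sum(judge(ctx, cands) == gold for ctx, cands, gold in examples)
    return correct / len(examples)

examples = [
    ("'Elementary,' he said, tapping his pipe. Watson sighed.",
     ["Holmes", "Watson"], "Holmes"),
    ("Watson: I fetched the telegram myself.",
     ["Holmes", "Watson"], "Watson"),
]
print(role_id_accuracy(examples))  # 0.5: the toy judge fails on narration
```

A real harness would replace `judge` with an LLM call and the toy examples with PersonaEval's dialogue excerpts.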
ALU: Agentic LLM Unlearning
Sanyal, Debdeep, Mandal, Murari
Information removal or suppression in large language models (LLMs) is a desired functionality, useful for AI regulation, legal compliance, safety, and privacy. LLM unlearning methods aim to remove information on demand from LLMs. Current methods struggle to balance unlearning efficacy and utility because these objectives compete. Keeping the unlearning process computationally feasible without assuming access to the model weights is also an overlooked area. We present the first agentic LLM unlearning (ALU) method, a multi-agent, retrain-free, model-agnostic approach to LLM unlearning that achieves effective unlearning while preserving utility. Our ALU framework unlearns by orchestrating multiple LLM agents, each designed for a specific step in the unlearning process, without updating the model weights of any agent in the framework. Users can request any set of unlearning instances in any sequence, and ALU seamlessly adapts in real time, without requiring any changes to the underlying LLM. Through extensive experiments on established benchmarks (TOFU, WMDP, WPU) and jailbreaking techniques (many-shot, target masking, other languages), we demonstrate that ALU consistently stands out as the most robust LLM unlearning framework among current state-of-the-art methods while incurring a low constant-time cost. We further highlight ALU's superior performance at scale: ALU is assessed on up to 1000 unlearning targets, exceeding the evaluation scope of all previously proposed LLM unlearning methods.
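The agentic, weight-free pipeline described above can be caricatured as a chain of stub agents post-processing a frozen generator. The agent roles, prompts, and `FORGET_SET` below are illustrative assumptions, not ALU's actual design:

```python
# Hypothetical unlearning targets a user has requested be suppressed.
FORGET_SET = {"Project Nightfall"}

def generator_agent(query: str) -> str:
    # Stand-in for the underlying, unmodified LLM.
    return f"Answer about {query}."

def detector_agent(text: str) -> set[str]:
    # Flags any unlearning target mentioned in the draft response.
    return {t for t in FORGET_SET if t.lower() in text.lower()}

def redactor_agent(text: str, hits: set[str]) -> str:
    # Rewrites the response to suppress flagged targets.
    for t in hits:
        text = text.replace(t, "[information removed]")
    return text

def alu_pipeline(query: str) -> str:
    # No agent's weights change; unlearning targets can be added to
    # FORGET_SET at any time and take effect immediately.
    draft = generator_agent(query)
    hits = detector_agent(draft)
    return redactor_agent(draft, hits) if hits else draft

print(alu_pipeline("Project Nightfall"))
print(alu_pipeline("the weather"))
```

Because suppression happens at inference time, adding or removing a target is a set update rather than a retraining run, which is the property the retrain-free framing emphasizes.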
- Asia > India (0.04)
- North America > United States > Indiana > Saint Joseph County > Granger (0.04)
- North America > United States > California (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Government (1.00)
Mitigating Algorithmic Bias in Multiclass CNN Classifications Using Causal Modeling
Byun, Min Sik, Hui, Wendy Wan Yee, Lau, Wai Kwong
This study describes a procedure for applying causal modeling to detect and mitigate algorithmic bias in a multiclass classification problem. The dataset was derived from the FairFace dataset, supplemented with emotional labels generated by the DeepFace pre-trained model. A custom Convolutional Neural Network (CNN) was developed, consisting of four convolutional blocks, followed by fully connected layers and dropout layers to mitigate overfitting. Gender bias was identified in the CNN model's classifications: Females were more likely to be classified as "happy" or "sad," while males were more likely to be classified as "neutral." To address this, the one-vs-all (OvA) technique was applied. A causal model was constructed for each emotion class to adjust the CNN model's predicted class probabilities. The adjusted probabilities for the various classes were then aggregated by selecting the class with the highest probability. The resulting debiased classifications demonstrated enhanced gender fairness across all classes, with negligible impact--or even a slight improvement--on overall accuracy. This study highlights that algorithmic fairness and accuracy are not necessarily trade-offs. All data and code for this study are publicly available for download.
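The final aggregation step described above, in which per-class causal models adjust the CNN's predicted probabilities and the class with the highest adjusted probability wins, can be sketched as follows; the `adjust` offsets are hypothetical placeholders for the fitted causal adjustments:

```python
def adjust(prob: float, emotion: str, gender: str) -> float:
    # Placeholder for the per-class causal adjustment, damping the
    # gender-correlated inflation (e.g. "happy"/"sad" for females).
    offsets = {("happy", "female"): -0.08, ("sad", "female"): -0.08,
               ("neutral", "male"): -0.08}
    return prob + offsets.get((emotion, gender), 0.0)

def debiased_label(probs: dict[str, float], gender: str) -> str:
    # One-vs-all: adjust each class probability independently, then
    # aggregate by taking the argmax over adjusted probabilities.
    adjusted = {e: adjust(p, e, gender) for e, p in probs.items()}
    return max(adjusted, key=adjusted.get)

print(debiased_label({"happy": 0.40, "sad": 0.25, "neutral": 0.35}, "female"))
```

Here a female face the raw CNN would call "happy" flips to "neutral" after adjustment, mirroring the direction of the bias correction the study reports.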
- Oceania > Australia > Western Australia (0.04)
- North America > United States > Indiana > Saint Joseph County > Granger (0.04)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- (4 more...)
On the Limitations and Prospects of Machine Unlearning for Generative AI
Zhou, Shiji, Wang, Lianzhe, Ye, Jiangnan, Wu, Yongliang, Chang, Heng
Generative AI (GenAI), which aims to synthesize realistic and diverse data samples from latent variables or other data modalities, has achieved remarkable results in various domains, such as natural language, images, audio, and graphs. However, these models also pose challenges and risks to data privacy, security, and ethics. Machine unlearning is the process of removing or weakening the influence of specific data samples or features from a trained model, without affecting its performance on other data or tasks. While machine unlearning has shown significant efficacy in traditional machine learning tasks, it remains unclear whether it can help make GenAI safer and better aligned with human intent. To this end, this position paper provides an in-depth discussion of machine unlearning approaches for GenAI. First, we formulate the problem of machine unlearning tasks on GenAI and introduce the background. Subsequently, we systematically examine the limitations of machine unlearning on GenAI models by focusing on two representative branches: LLMs and image generative (diffusion) models. Finally, we offer our prospects from three main aspects: benchmarks, evaluation metrics, and the utility-unlearning trade-off, and conscientiously advocate for the future development of this field.
- North America > United States > Indiana > Saint Joseph County > Granger (0.04)
- Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
- Asia > Singapore (0.04)
- Asia > Indonesia > Bali (0.04)
- Research Report > Promising Solution (0.46)
- Research Report > New Finding (0.34)
Jogging the Memory of Unlearned Model Through Targeted Relearning Attack
Hu, Shengyuan, Fu, Yiwei, Wu, Zhiwei Steven, Smith, Virginia
Machine unlearning is a promising approach to mitigating undesirable memorization of training data in ML models. However, in this work we show that existing approaches for unlearning in LLMs are surprisingly susceptible to a simple set of targeted relearning attacks. With access to only a small and potentially loosely related set of data, we find that we can "jog" the memory of unlearned models to reverse the effects of unlearning. We formalize this unlearning-relearning pipeline, explore the attack across three popular unlearning benchmarks, and discuss future directions and guidelines that result from our study.
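The unlearn-relearn pipeline can be illustrated with a deliberately crude caricature, in which a "model" is a table of associations and fine-tuning on loosely related text lifts the suppression; no real LLM training is involved, and the mechanism shown is only an analogy for the attack:

```python
# Caricature model: stored associations plus a set of suppressed keys.
model = {"associations": {"Harry Potter": "Hogwarts"}, "suppressed": set()}

def unlearn(model, target: str):
    # "Unlearning" here merely suppresses retrieval of the target.
    model["suppressed"].add(target)

def relearn(model, related_texts: list[str]):
    # Relearning on loosely related data mentioning the target
    # lifts the suppression, reversing the unlearning.
    for text in related_texts:
        for target in list(model["suppressed"]):
            if target in text:
                model["suppressed"].discard(target)

def query(model, key: str) -> str:
    if key in model["suppressed"]:
        return "[unknown]"
    return model["associations"].get(key, "[unknown]")

unlearn(model, "Harry Potter")
assert query(model, "Harry Potter") == "[unknown]"   # unlearning "works"
relearn(model, ["Harry Potter is a famous book series."])
print(query(model, "Harry Potter"))  # prints "Hogwarts": memory jogged
```

The point of the caricature is that suppression without true removal leaves the association recoverable, which is the failure mode the attack exploits.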
- North America > United States > Indiana > Saint Joseph County > Granger (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Virginia (0.04)
- (4 more...)
- Research Report > New Finding (0.46)
- Research Report > Promising Solution (0.35)
From Persona to Personalization: A Survey on Role-Playing Language Agents
Chen, Jiangjie, Wang, Xintao, Xu, Rui, Yuan, Siyu, Zhang, Yikai, Shi, Wei, Xie, Jian, Li, Shuang, Yang, Ruihan, Zhu, Tinghui, Chen, Aili, Li, Nianqi, Chen, Lida, Hu, Caiyu, Wu, Siye, Ren, Scott, Fu, Ziquan, Xiao, Yanghua
Recent advancements in large language models (LLMs) have significantly boosted the rise of Role-Playing Language Agents (RPLAs), i.e., specialized AI systems designed to simulate assigned personas. By harnessing multiple advanced abilities of LLMs, including in-context learning, instruction following, and social intelligence, RPLAs achieve a remarkable sense of human likeness and vivid role-playing performance. RPLAs can mimic a wide range of personas, ranging from historical figures and fictional characters to real-life individuals. Consequently, they have catalyzed numerous AI applications, such as emotional companions, interactive video games, personalized assistants and copilots, and digital clones. In this paper, we conduct a comprehensive survey of this field, illustrating the evolution and recent progress in RPLAs integrating with cutting-edge LLM technologies. We categorize personas into three types: 1) Demographic Persona, which leverages statistical stereotypes; 2) Character Persona, focused on well-established figures; and 3) Individualized Persona, customized through ongoing user interactions for personalized services. We begin by presenting a comprehensive overview of current methodologies for RPLAs, followed by the details for each persona type, covering corresponding data sourcing, agent construction, and evaluation. Afterward, we discuss the fundamental risks, existing limitations, and future prospects of RPLAs. Additionally, we provide a brief review of RPLAs in AI applications, which reflects practical user demands that shape and drive RPLA research. Through this work, we aim to establish a clear taxonomy of RPLA research and applications, and facilitate future research in this critical and ever-evolving field, and pave the way for a future where humans and RPLAs coexist in harmony.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Asia > Singapore (0.04)
- Asia > Indonesia > Bali (0.04)
- (13 more...)
- Research Report (1.00)
- Overview (1.00)
- Leisure & Entertainment > Games > Computer Games (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine > Therapeutic Area (1.00)
- (2 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Leveraging Prompt-Based Large Language Models: Predicting Pandemic Health Decisions and Outcomes Through Social Media Language
Ding, Xiaohan, Carik, Buse, Gunturi, Uma Sushmitha, Reyna, Valerie, Rho, Eugenia H.
We introduce a multi-step reasoning framework using prompt-based LLMs to examine the relationship between social media language patterns and trends in national health outcomes. Grounded in fuzzy-trace theory, which emphasizes the importance of gists of causal coherence in effective health communication, we introduce Role-Based Incremental Coaching (RBIC), a prompt-based LLM framework, to identify gists at scale. Using RBIC, we systematically extract gists from subreddit discussions opposing COVID-19 health measures (Study 1). We then track how these gists evolve across key events (Study 2) and assess their influence on online engagement (Study 3). Finally, we investigate how the volume of gists is associated with national health trends like vaccine uptake and hospitalizations (Study 4). Our work is the first to empirically link social media linguistic patterns to real-world public health trends, highlighting the potential of prompt-based LLMs in identifying critical online discussion patterns that can form the basis of public health communication strategies.
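A prompt chain in the RBIC spirit, each step assigning the model a role and building on the previous step's output, might look like the following; `call_llm` is a stub and the prompts and role wording are invented, not the paper's:

```python
def call_llm(prompt: str) -> str:
    # Stub returning canned responses keyed on the instruction; a real
    # pipeline would call an LLM API here.
    if "identify the gist" in prompt:
        return "gist: vaccine mandates cause harm"
    if "classify" in prompt:
        return "category: cause-effect"
    return ""

def rbic(post: str) -> dict:
    # Step 1: role-primed gist extraction; Step 2: classification that
    # incrementally builds on the extracted gist.
    steps = {}
    role = "You are a health-communication analyst."
    steps["gist"] = call_llm(f"{role} Read the post and identify the gist: {post}")
    steps["label"] = call_llm(
        f"{role} Given '{steps['gist']}', classify its causal structure.")
    return steps

print(rbic("Masks do nothing and the mandate ruined my business."))
```

Chaining the steps (rather than asking for gist and label at once) is what makes the extraction "incremental": each prompt conditions on the verified output of the last.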
- North America > United States > Texas > Travis County > Austin (0.14)
- North America > United States > Hawaii (0.05)
- North America > United States > New York > New York County > New York City (0.05)
- (23 more...)
Signature Kernel Conditional Independence Tests in Causal Discovery for Stochastic Processes
Manten, Georg, Casolo, Cecilia, Ferrucci, Emilio, Mogensen, Søren Wengel, Salvi, Cristopher, Kilbertus, Niki
Inferring the causal structure underlying stochastic dynamical systems from observational data holds great promise in domains ranging from science and health to finance. Such processes can often be accurately modeled via stochastic differential equations (SDEs), which naturally imply causal relationships via "which variables enter the differential of which other variables". In this paper, we develop a kernel-based test of conditional independence (CI) on "path-space" -- solutions to SDEs -- by leveraging recent advances in signature kernels. We demonstrate strictly superior performance of our proposed CI test compared to existing approaches on path-space. Then, we develop constraint-based causal discovery algorithms for acyclic stochastic dynamical systems (allowing for loops) that leverage temporal information to recover the entire directed graph. Assuming faithfulness and a CI oracle, our algorithm is sound and complete. We empirically verify that our developed CI test in conjunction with the causal discovery algorithm reliably outperforms baselines across a range of settings.
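Given a CI oracle, the constraint-based skeleton-recovery loop is standard: start from a complete graph and delete an edge whenever its endpoints test independent given some conditioning set. The sketch below hard-codes an oracle for a toy chain X -> Y -> Z; in the paper's setting, the signature-kernel CI test on path-space would play the role of `ci_oracle`:

```python
from itertools import combinations

def ci_oracle(a: str, b: str, cond: frozenset) -> bool:
    # Toy oracle for the chain X -> Y -> Z: X and Z are conditionally
    # independent given {Y}; all other pairs stay dependent.
    return {a, b} == {"X", "Z"} and "Y" in cond

def skeleton(variables: list[str]) -> set[frozenset]:
    # Start complete; drop an edge once any conditioning set separates it.
    edges = {frozenset(p) for p in combinations(variables, 2)}
    for a, b in combinations(variables, 2):
        others = [v for v in variables if v not in (a, b)]
        for size in range(len(others) + 1):
            for cond in combinations(others, size):
                if ci_oracle(a, b, frozenset(cond)):
                    edges.discard(frozenset((a, b)))
    return edges

print(sorted(sorted(e) for e in skeleton(["X", "Y", "Z"])))
# [['X', 'Y'], ['Y', 'Z']]: the spurious X-Z edge is removed
```

The paper's algorithms additionally exploit temporal ordering to orient the remaining edges, which this undirected sketch omits.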
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- North America > United States > Virginia > Arlington County > Arlington (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (6 more...)
Privacy Issues in Large Language Models: A Survey
This is the first survey of the active area of AI research that focuses on privacy issues in Large Language Models (LLMs). Specifically, we focus on work that red-teams models to highlight privacy risks, attempts to build privacy into the training or inference process, enables efficient data deletion from trained models to comply with existing privacy regulations, and tries to mitigate copyright issues. Our focus is on summarizing technical research that develops algorithms, proves theorems, and runs empirical evaluations. While there is an extensive body of legal and policy work addressing these challenges from a different angle, that is not the focus of our survey. Nevertheless, these works, along with recent legal developments, do inform how these technical problems are formalized, so we discuss them briefly in Section 1. While we have made our best effort to include all the relevant work, due to the fast-moving nature of this research we may have missed some recent work. If we have missed some of your work, please contact us, as we will attempt to keep this survey relatively up to date. We maintain a repository with the list of papers covered in this survey and any relevant publicly available code at https://github.com/safr-ml-lab/survey-llm.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Italy > Tuscany > Florence (0.04)
- (27 more...)
- Overview (1.00)
- Research Report > New Finding (0.45)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
Agent based network modelling of COVID-19 disease dynamics and vaccination uptake in a New South Wales Country Township
Yeung, Shing Hin, Piraveenan, Mahendra
We employ an agent-based contact network model to study the relationship between vaccine uptake and disease dynamics in a hypothetical country town in New South Wales, Australia, undergoing a COVID-19 epidemic over a period of three years. We model the contact network in this hypothetical township of N = 10000 people as a scale-free network, and simulate the spread of COVID-19 and a vaccination program using disease and vaccination-uptake parameters typically observed in such a NSW town. We simulate the spread of the ancestral variant of COVID-19 in this town, and study the disease dynamics while the town maintains limited but non-negligible contact with the rest of the country, which is assumed to be undergoing a severe COVID-19 epidemic. We also simulate a maximum of three doses of the Pfizer Comirnaty vaccine being administered in this town, with vaccine supply that is limited at first and gradually increases, and analyse how vaccination uptake affects the disease dynamics in this town, which is captured using an extended compartmental model with epidemic parameters typical for a COVID-19 epidemic in Australia. Our results show that, in such a township, three vaccination doses are sufficient to contain but not eradicate COVID-19, and the disease essentially becomes endemic. We also show that the average degree of infected nodes (the average number of contacts for infected people) predicts the proportion of infected people. Therefore, if the hubs (people with a relatively high number of contacts) are disproportionately infected, this indicates an oncoming peak of the infection, though the lag time thereof depends on the maximum number of vaccines administered to the populace. Overall, our analysis provides interesting insights into the interplay between network topology, vaccination levels, and COVID-19 disease dynamics in a typical remote NSW country town.
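The modelling setup, a scale-free contact network with a compartmental process running on top, can be sketched in a few dozen lines. The preferential-attachment generator, transmission/recovery rates, and 200-node population below are illustrative assumptions rather than the paper's calibrated NSW-town parameters, and vaccination is omitted for brevity:

```python
import random

def barabasi_albert(n: int, m: int, rng: random.Random) -> dict[int, set[int]]:
    # Scale-free network via preferential attachment: each new node links
    # to m targets sampled in proportion to existing degree.
    adj = {i: set() for i in range(n)}
    targets = list(range(m))
    repeated = []  # nodes repeated once per unit of degree
    for new in range(m, n):
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
        repeated.extend(targets)
        repeated.extend([new] * m)
        targets = list({rng.choice(repeated) for _ in range(m)})
    return adj

def sir_step(adj, state, beta, gamma, rng):
    # One discrete time step of an SIR process on the contact network.
    new_state = dict(state)
    for node, s in state.items():
        if s == "I":
            for nb in adj[node]:
                if state[nb] == "S" and rng.random() < beta:
                    new_state[nb] = "I"
            if rng.random() < gamma:
                new_state[node] = "R"
    return new_state

rng = random.Random(0)
adj = barabasi_albert(200, 2, rng)
state = {i: "S" for i in adj}
state[0] = "I"  # seed one infection
for _ in range(50):
    state = sir_step(adj, state, beta=0.05, gamma=0.1, rng=rng)
counts = {s: sum(v == s for v in state.values()) for s in "SIR"}
print(counts)
```

Tracking the average degree of the currently infected nodes at each step would reproduce the paper's hub-based early-warning signal; an extended compartmental model with vaccination states would replace the plain SIR update.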
- Oceania > Australia > New South Wales (0.60)
- North America > United States > Indiana > Saint Joseph County > Granger (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Health & Medicine > Therapeutic Area > Vaccines (1.00)
- Health & Medicine > Therapeutic Area > Immunology (1.00)